In this note I provide simple and easily verifiable conditions under which a strong 
form of stochastic equicontinuity holds in a wide variety of modern time series models. 
In contrast to most results currently available in the literature, my methods avoid 
mixing conditions. I discuss two applications in detail. 

1. Introduction 

Stochastic equicontinuity typically captures the key difficulty in weak convergence proofs 
of estimators with non-differentiable objective functions. Precise and elegant methods 
have been found to deal with cases where the data dependence structure can be described 
by mixing conditions; see Dedecker et al. (2007) for an excellent summary. Mixing
assumptions are convenient in this context because they measure how events generated by
time series observations — rather than the observations themselves — relate to one another 
and therefore also measure dependence of functions of such time series. The downside 
to these assumptions is that they can be hard to verify for a given application. Hansen 
(1996) describes alternatives and considers parametric classes of functions that behave like 
mixingales, but his results come at the expense of Lipschitz continuity conditions on these 
functions and rule out many applications of interest. 

In this note I give simple and easily verifiable conditions under which objective functions 
of econometric estimators are stochastically equicontinuous when the underlying process 
is a stationary time series of the form 

\[ \xi_i = \xi(\varepsilon_i, \varepsilon_{i-1}, \varepsilon_{i-2}, \ldots). \qquad (1.1) \]

Here $(\varepsilon_i)_{i \in \mathbb{Z}}$ is a sequence of iid copies of a random variable $\varepsilon$ and $\xi$ is a measurable, possibly
unknown function that transforms the input $(\varepsilon_i, \varepsilon_{i-1}, \ldots)$ into the output $\xi_i$. The stochastic
equicontinuity problem does not have to be parametric and no continuity conditions are 
needed. The class (1.1) allows for the construction of dependence measures that are directly 
related to the stochastic process and includes a large number of commonly-used stationary 
time series models. The next section provides several specific examples. 

In the following, $\|X\|_p$ denotes $(E|X|^p)^{1/p}$ and $P^*$ and $E^*$ are outer probability and
expectation, respectively (see van der Vaart, 1998, p. 258). Limits are as $n \to \infty$.

2. Stochastic Equicontinuity in Nonlinear Time Series Models 

Let $\nu_n f := n^{-1/2} \sum_{i=1}^{n} (f(\xi_i) - E f(\xi_0))$ be the empirical process evaluated at some function
$f$. Here $f$ is a member of a class of real-valued functions $\mathcal{F}$. In econometric applications, $\mathcal{F}$
is typically a parametric class $\{f_\theta : \theta \in \Theta\}$, where $\Theta$ is a bounded subset of $\mathbb{R}^k$, although no
parametric restriction on $\mathcal{F}$ is necessary in the following. Define a norm by $\rho(f) = \|f(\xi_0)\|_2$.
An empirical process is said to be stochastically equicontinuous (see, e.g., Pollard, 1985, p.
139) on $\mathcal{F}$ if for all $\epsilon > 0$ and $\eta > 0$, there is a $\delta > 0$ such that

\[ \limsup_{n \to \infty} P^*\Bigl( \sup_{f, g \in \mathcal{F} : \rho(f - g) < \delta} |\nu_n(f - g)| > \eta \Bigr) < \epsilon. \qquad (2.1) \]

As mentioned above, proving stochastic equicontinuity is often the key difficulty in weak 
convergence proofs. The next two examples illustrate typical applications. 

Example 1 (Quantilograms). Linton and Whang (2007) measure the directional predictive
ability of stationary time series $(X_i)_{i \in \mathbb{Z}}$ with the quantilogram, a normalized version of
$E(\alpha - 1\{X_0 < \theta_\alpha\})(\alpha - 1\{X_h < \theta_\alpha\})$ with $\alpha \in (0, 1)$ and $h = 1, 2, \ldots$, where $\theta_\alpha$ is the
$\alpha$-quantile of the marginal distribution of $(X_i)_{i \in \mathbb{Z}}$. Let $\xi_i = (X_{i-h}, X_i)$ and
$f_\theta(\xi_i) = (\alpha - 1\{X_{i-h} < \theta\})(\alpha - 1\{X_i < \theta\})$. Under the null hypothesis of no directional predictability,
we have $E f_{\theta_\alpha}(\xi_0) = 0$ for all $h = 1, 2, \ldots$. Let $\hat\theta_{n,\alpha}$ be the sample $\alpha$-quantile and replace
population moments by sample moments to obtain $(n - h)^{-1} \sum_{i=1+h}^{n} f_{\hat\theta_{n,\alpha}}(\xi_i)$, the sample
version of $E f_{\theta_\alpha}(\xi_0)$. Apart from a scaling factor, the asymptotic null distribution of the
sample quantilogram can be determined through the decomposition

\[ (n - h)^{-1/2} \sum_{i=1+h}^{n} f_{\hat\theta_{n,\alpha}}(\xi_i) = \sqrt{n - h}\, E f_{\hat\theta_{n,\alpha}}(\xi_0) + \nu_{n-h} f_{\theta_\alpha} + \nu_{n-h}(f_{\hat\theta_{n,\alpha}} - f_{\theta_\alpha}). \]

If the distribution of $X_i$ is smooth, the delta method can be used to control the first term
on the right and, under dependence conditions, an ordinary central limit theorem applies to
the second term. Further, we have $\rho(f_{\hat\theta_{n,\alpha}} - f_{\theta_\alpha}) \to_p 0$ whenever $\hat\theta_{n,\alpha} \to_p \theta_\alpha$ (see Example
3 below). Hence, we can take $\mathcal{F} = \{f_\theta : \theta \in \Theta\}$, where $\Theta$ is a compact neighborhood of $\theta_\alpha$,
and as long as (2.1) holds, the third term on the right-hand side of the preceding display
converges to zero in probability because in large samples

\[ P\bigl( |\nu_{n-h}(f_{\hat\theta_{n,\alpha}} - f_{\theta_\alpha})| > \eta,\ \rho(f_{\hat\theta_{n,\alpha}} - f_{\theta_\alpha}) < \delta \bigr) \le P^*\Bigl( \sup_{\theta \in \Theta : \rho(f_\theta - f_{\theta_\alpha}) < \delta} |\nu_{n-h}(f_\theta - f_{\theta_\alpha})| > \eta \Bigr). \]
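As a minimal sketch of the sample moment in Example 1 (not the normalized quantilogram of Linton and Whang itself), the following Python code evaluates $(n - h)^{-1} \sum_{i=1+h}^{n} f_\theta(\xi_i)$ at a sample quantile. The function names and the order-statistic convention for the sample quantile are assumptions made for illustration only.

```python
import math


def sample_quantile(x, alpha):
    """Order-statistic sample alpha-quantile (one common convention, assumed here)."""
    ordered = sorted(x)
    k = max(1, math.ceil(alpha * len(ordered)))
    return ordered[k - 1]


def cross_moment(x, alpha, h, theta):
    """Average of (alpha - 1{X_{i-h} < theta}) * (alpha - 1{X_i < theta}) over i = 1+h, ..., n."""
    n = len(x)
    if not 0 < h < n:
        raise ValueError("lag h must satisfy 0 < h < len(x)")
    total = 0.0
    for i in range(h, n):  # 0-based position i is observation i + 1 in the text
        total += (alpha - (x[i - h] < theta)) * (alpha - (x[i] < theta))
    return total / (n - h)


x = [1.0, 2.0, 3.0, 4.0]
theta_hat = sample_quantile(x, 0.5)
moment = cross_moment(x, 0.5, 1, theta_hat)
```

Passing `theta` explicitly keeps the map $\theta \mapsto f_\theta$ visible, which is the object whose empirical process has to be stochastically equicontinuous.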

Example 2 (Robust M-estimators of location). Robust location estimators can often be
defined implicitly as an M-estimator $\hat\theta_n$ that nearly solves $n^{-1} \sum_{i=1}^{n} f_\theta(\xi_i) = 0$ in the sense
that $\sum_{i=1}^{n} f_{\hat\theta_n}(\xi_i) = o_p(\sqrt{n})$. Popular examples include the median with $f_\theta(x) = \mathrm{sign}(x - \theta)$
and Huber estimators with $f_\theta(x) = -\Delta 1\{x - \theta < -\Delta\} + (x - \theta) 1\{|x - \theta| \le \Delta\} + \Delta 1\{x - \theta > \Delta\}$
for some $\Delta > 0$. Add and subtract in $\sum_{i=1}^{n} f_{\hat\theta_n}(\xi_i) = o_p(\sqrt{n})$ to see that stochastic
equicontinuity implies $\sqrt{n}\, E f_{\hat\theta_n}(\xi_0) + \nu_n f_{\theta_0} = o_p(1)$. The limiting behavior of $\sqrt{n}(\hat\theta_n - \theta_0)$
can then again be determined through the delta method and a central limit theorem.
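To make the estimating equation in Example 2 concrete, here is a small Python sketch of the Huber score and a bisection solver for $\sum_i f_\theta(x_i) = 0$. The helper names and the choice of bisection are illustrative assumptions; any root finder for a monotone function would serve equally well.

```python
def huber_score(x, theta, delta):
    """Huber score: x - theta clipped to the interval [-delta, delta]."""
    return max(-delta, min(delta, x - theta))


def huber_location(data, delta, iterations=200):
    """Bisection for a root of theta -> sum_i huber_score(x_i, theta, delta)."""
    lo, hi = min(data), max(data)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        # The summed score is nonincreasing in theta, so a positive sum
        # means the root lies to the right of mid.
        if sum(huber_score(x, mid, delta) for x in data) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


estimate = huber_location([0.0, 0.1, -0.1, 100.0], delta=1.0)
```

In this toy data set the single outlier contributes at most $\Delta = 1$ to the score, so the estimate solves $1 - 3\theta = 0$ and stays near the bulk of the data, whereas the sample mean is 25.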

Stochastic equicontinuity cannot hold without restrictions on the complexity of the set $\mathcal{F}$;
see, e.g., Andrews (1994, pp. 2252-2253). Here, complexity of $\mathcal{F}$ is measured via its bracketing
number $N = N(\delta, \mathcal{F})$, the smallest number for which there are functions $f_1, \ldots, f_N \in \mathcal{F}$
and functions $b_1, \ldots, b_N$ (not necessarily in $\mathcal{F}$) such that $\rho(b_k) \le \delta$ for all
$1 \le k \le N$ and every $f \in \mathcal{F}$ satisfies $|f - f_k| \le b_k$ for some $1 \le k \le N$.
In addition, some restrictions are required on the memory of the time series.
For processes of the form (1.1), the memory is most easily controlled by comparing $\xi_i$ to
a slightly perturbed version of itself (see Wu, 2005). Let $(\varepsilon_i^*)_{i \in \mathbb{Z}}$ be an iid copy of $(\varepsilon_i)_{i \in \mathbb{Z}}$,
so that $\xi_i$ and $\xi_i' := \xi(\varepsilon_i, \ldots, \varepsilon_1, \varepsilon_0^*, \varepsilon_{-1}^*, \ldots)$ differ only in the inputs prior to
period 1. Assume the following:

Assumption A. Let $\mathcal{F}$ be a uniformly bounded class of real-valued functions with bracketing
numbers $N(\delta, \mathcal{F}) < \infty$. Then there exist some $\alpha \in (0, 1)$ and $p > 0$ such that

(i) $\sup_{f \in \mathcal{F}} \|f(\xi_n) - f(\xi_n')\|_p = O(\alpha^n)$ and

(ii) $\max_{1 \le k \le N(\delta, \mathcal{F})} \|b_k(\xi_n) - b_k(\xi_n')\|_p = O(\alpha^n)$ for any given $\delta > 0$.

Remarks. (i) The examples at the end of this section show that Assumption A often
represents only a mild restriction on the dependence structure.

(ii) Because $\mathcal{F}$ is assumed to be uniformly bounded, the bounding functions $b_k$ can be
chosen to be bounded as well. Hence, in view of Lemma 2 of Wu and Min (2005), the exact
choice of $p$ in Assumption A is irrelevant, for if the assumption holds for some $p$, then it
holds for all $p > 0$.
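The coupling behind Assumption A can be illustrated with a stationary AR(1) process $\xi_i = a\xi_{i-1} + \varepsilon_i$, which is of the form (1.1). The sketch below is only an illustration under assumptions that are not in the text: Gaussian innovations, a finite burn-in in place of the infinite past, and hypothetical function names. Both paths share $\varepsilon_1, \ldots, \varepsilon_n$ and use independent inputs up to period 0.

```python
import random


def coupled_ar1(a, n, seed=0, burn_in=500):
    """Simulate xi_0..xi_n and the coupled xi'_0..xi'_n for xi_i = a * xi_{i-1} + eps_i.

    Both paths share eps_1, ..., eps_n; the inputs up to period 0 are drawn
    independently, and the infinite past is truncated after burn_in draws.
    """
    rng = random.Random(seed)
    x = x_star = 0.0
    for _ in range(burn_in):  # independent pasts for the two paths
        x = a * x + rng.gauss(0.0, 1.0)
        x_star = a * x_star + rng.gauss(0.0, 1.0)
    path, path_prime = [x], [x_star]
    for _ in range(n):  # shared innovations from period 1 onward
        eps = rng.gauss(0.0, 1.0)
        path.append(a * path[-1] + eps)
        path_prime.append(a * path_prime[-1] + eps)
    return path, path_prime


xi, xi_prime = coupled_ar1(a=0.5, n=20)
gaps = [u - v for u, v in zip(xi, xi_prime)]
```

Up to rounding, `gaps[i]` equals `0.5 ** i * gaps[0]`, so $\|\xi_n - \xi_n'\|_p = |a|^n \|\xi_0 - \xi_0'\|_p$ decays geometrically. For a Lipschitz $f$ the same rate carries over to $\|f(\xi_n) - f(\xi_n')\|_p$, which is the type of bound required in Assumption A(i).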

Assumption A and a complexity requirement on $\mathcal{F}$ given by a bracketing integral imply 
a strong form of stochastic equicontinuity. The following theorem (the proof of which is 
found in the Appendix) is similar to Andrews and Pollard's (1994) Theorem 2.2 with their 
mixing condition replaced by Assumption A. It implies (2.1) via the Markov inequality. 

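Theorem. Suppose that Assumption A holds and $\int_0^1 x^{-\gamma/(2+\gamma)} N(x, \mathcal{F})^{1/Q}\, dx < \infty$ for some
$\gamma > 0$ and an even integer $Q \ge 2$. Then for every $\epsilon > 0$, there is a $\delta > 0$ such that

\[ \limsup_{n \to \infty} E^*\Bigl( \sup_{f, g \in \mathcal{F} : \rho(f - g) < \delta} |\nu_n(f - g)| \Bigr) < \epsilon. \]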
A useful feature of this theorem is that the constants $\gamma$ and $Q$ are not connected to the
dependence measures as in Andrews and Pollard (1994). In contrast to their result for
mixing arrays, $\gamma$ and $Q$ can therefore be chosen to be as small and large, respectively, as
desired to make the bracketing integral converge without restricting the set of time series
under consideration.
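To see how the bracketing integral $\int_0^1 x^{-\gamma/(2+\gamma)} N(x, \mathcal{F})^{1/Q}\, dx$ restricts $\gamma$ and $Q$, suppose for illustration that the bracketing numbers grow polynomially, $N(x, \mathcal{F}) \le K x^{-r}$ on $(0, 1]$. The constants $K > 0$ and $r > 0$ are hypothetical and serve only this worked calculation.

```latex
\[
\int_0^1 x^{-\gamma/(2+\gamma)} N(x, \mathcal{F})^{1/Q}\, dx
\le K^{1/Q} \int_0^1 x^{-\gamma/(2+\gamma) - r/Q}\, dx < \infty
\quad \text{whenever} \quad
\frac{\gamma}{2+\gamma} + \frac{r}{Q} < 1
\iff Q > \frac{r(2+\gamma)}{2}.
\]
```

With $\gamma = 1$ the requirement reads $Q > 3r/2$, so $r = 2$ is covered by the even integer $Q = 4$ and $r = 1$ already by $Q = 2$.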

Before concluding this note, the next two examples illustrate how to apply the theorem 
and how to verify Assumption A in practice. 

Example 3 (Quantilograms, continued). Suppose for simplicity that $F_X(\theta) := P(X_0 \le \theta)$
is Lipschitz on $\Theta$. Take a grid of points $\min \Theta =: \theta_0 < \theta_1 < \cdots < \theta_N := \max \Theta$ and
let $b_k(\xi_i) = 1\{X_{i-h} < \theta_k\} - 1\{X_{i-h} < \theta_{k-1}\} + 1\{X_i < \theta_k\} - 1\{X_i < \theta_{k-1}\}$. Given a
$\theta \in \Theta$, we can then find an index $k$ such that $|f_\theta - f_{\theta_k}| \le b_k$, where I used the fact that
$|\alpha - 1\{\cdot\}| \le \max\{\alpha, 1 - \alpha\} < 1$. Moreover, by stationarity

\[ \rho(b_k) \le 2 \|1\{X_0 < \theta_k\} - 1\{X_0 < \theta_{k-1}\}\|_2 \le 2 \sqrt{F_X(\theta_k) - F_X(\theta_{k-1})}, \]

which is bounded above by a constant multiple of $\sqrt{\theta_k - \theta_{k-1}}$ due to Lipschitz continuity.
Hence, if $\rho(b_k) \le \delta$ for all $k = 1, \ldots, N$, we can choose bracketing numbers with respect to
$\rho$ of order $N(\delta, \mathcal{F}) = O(\delta^{-2})$ as $\delta \to 0$ (see Andrews and Pollard, 1994; van der Vaart, 1998,
pp. 270-272) and the bracketing integral converges, e.g., for $\gamma = 1$ and $Q = 4$. By the same
calculations as in the preceding display, all $\theta, \theta' \in \Theta$ satisfy $\rho(f_\theta - f_{\theta'}) = O(|\theta - \theta'|^{1/2})$ as $\theta \to \theta'$
and therefore $\rho(f_{\hat\theta_{n,\alpha}} - f_{\theta_\alpha}) \to_p 0$ if $\hat\theta_{n,\alpha} \to_p \theta_\alpha$. In addition, suppose that the geometric
moment contraction (GMC) property of Wu and Min (2005) holds, i.e., there is some $\beta \in (0, 1)$
and $p > 0$ such that $\|\xi_n - \xi_n'\|_p = O(\beta^n)$. Then Assumption A(i) is also satisfied because
$\|f_\theta(\xi_n) - f_\theta(\xi_n')\|_p \le 2 \|1\{X_n < \theta\} - 1\{X_n' < \theta\}\|_p + 2 \|1\{X_{n-h} < \theta\} - 1\{X_{n-h}' < \theta\}\|_p = O(\alpha^n)$
uniformly in $\theta$ for some $\alpha \in (0, 1)$ by Proposition 3.1 of Hagemann (2011). The
GMC property holds, e.g., for stationary (causal) ARMA, ARCH, GARCH, ARMA-ARCH,
ARMA-GARCH, asymmetric GARCH, generalized random coefficient autoregressive, and
quantile autoregressive models; see Shao and Wu (2007) and Shao (2011) for proofs and
more examples. All of these models therefore also satisfy Assumption A(i). The same
reasoning applies to $b_k$.
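The bracketing inequality $|f_\theta - f_{\theta_k}| \le b_k$ used in Example 3 is a pointwise statement and can be checked mechanically on a finite grid. The following Python sketch does this for $\theta$ in each cell $[\theta_{k-1}, \theta_k]$. The grid, the evaluation points, the value of $\alpha$, and all function names are arbitrary choices made for illustration.

```python
def f_theta(x_lag, x, theta, alpha):
    """Quantilogram summand (alpha - 1{x_lag < theta}) * (alpha - 1{x < theta})."""
    return (alpha - (x_lag < theta)) * (alpha - (x < theta))


def bracket(x_lag, x, theta_lo, theta_hi):
    """b_k for the cell [theta_lo, theta_hi]: indicator increments of both coordinates."""
    return ((x_lag < theta_hi) - (x_lag < theta_lo)) + ((x < theta_hi) - (x < theta_lo))


def max_violation(grid, alpha, points, thetas_per_cell=5):
    """Largest value of |f_theta - f_{theta_k}| - b_k found; a value <= 0 means no violation."""
    worst = float("-inf")
    for lo, hi in zip(grid[:-1], grid[1:]):
        for j in range(thetas_per_cell + 1):
            theta = lo + (hi - lo) * j / thetas_per_cell
            for x_lag in points:
                for x in points:
                    gap = abs(f_theta(x_lag, x, theta, alpha) - f_theta(x_lag, x, hi, alpha))
                    worst = max(worst, gap - bracket(x_lag, x, lo, hi))
    return worst


grid = [0.0, 0.25, 0.5, 0.75, 1.0]
points = [i / 40 for i in range(-4, 45)]
violation = max_violation(grid, alpha=0.3, points=points)
```

A finite check of this kind is no substitute for the argument in the text, but it makes the role of the upper grid point $\theta_k$ and of the bound $|\alpha - 1\{\cdot\}| < 1$ easy to see.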

Example 4 (Robust M-estimators of location, continued). Arguments nearly identical to those
in the preceding example yield stochastic equicontinuity for the median. For the Huber
estimator, take the grid from before and note that we can find a $k$ such that
$|f_\theta - f_{\theta_k}| \le \min\{\theta_k - \theta_{k-1}, 2\Delta\} =: b_k$. A routine argument (Andrews and Pollard, 1994; van der Vaart,
1998, Example 19.7, pp. 270-271) yields bracketing numbers of order $N(\delta, \mathcal{F}) = O(\delta^{-1})$ as
$\delta \to 0$; the bracketing integral is finite, e.g., for $\gamma = 1$ and $Q = 2$. Assumption A(i) can be
verified via the bound $\sup_{\theta \in \Theta} \|f_\theta(\xi_n) - f_\theta(\xi_n')\|_p \le \|\xi_n - \xi_n'\|_p$ and (ii) holds trivially.
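Both bounds used for the Huber estimator in Example 4 rest on the fact that clipping is 1-Lipschitz. The short Python sketch below checks the two pointwise inequalities on finite grids. The grids, the tolerance, and the function names are assumptions made for this illustration.

```python
def huber_score(x, theta, delta):
    """Huber score: x - theta clipped to the interval [-delta, delta]."""
    return max(-delta, min(delta, x - theta))


def check_bounds(values, thetas, delta, tol=1e-12):
    """Return True if both Lipschitz-type bounds hold at every grid point."""
    for theta in thetas:
        for x in values:
            fx = huber_score(x, theta, delta)
            for y in values:
                # Lipschitz in the data: |f_theta(x) - f_theta(y)| <= |x - y|
                if abs(fx - huber_score(y, theta, delta)) > abs(x - y) + tol:
                    return False
            for other in thetas:
                # Bracket bound: |f_theta(x) - f_theta'(x)| <= min(|theta - theta'|, 2 * delta)
                bound = min(abs(theta - other), 2 * delta)
                if abs(fx - huber_score(x, other, delta)) > bound + tol:
                    return False
    return True


ok = check_bounds([i / 10 for i in range(-50, 51)], [i / 4 for i in range(-8, 9)], delta=1.0)
```

The first inequality, applied to $x = \xi_n$ and $y = \xi_n'$, gives the bound on $\|f_\theta(\xi_n) - f_\theta(\xi_n')\|_p$, and the second gives the constant bracketing functions $b_k$.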
Appendix 



A. Proofs 

Proof of the Theorem. This follows from a simple modification of Andrews and Pollard's
(1994) proof of their Theorem 2.1. The proof requires three steps: (i) Their "Proof of
inequality (3.2)," (ii) their "Proof of inequality (3.3)," and (iii) their "Comparison of pairs"
argument. Replace their $i$ with $k$ and their $\tau(b_i)$ with $\tau(b_k)$; then apply the Lemma below
instead of Andrews and Pollard's (1994) Lemma 3.1 in the derivation of their inequality
(3.5) to deduce $\| \max_{1 \le k \le N} |\nu_n b_k| \|_Q \le C' N^{1/Q} \max\{n^{-1/2}, \max_{1 \le k \le N} \tau(b_k)\}$ and use this
in (i) instead of their inequality (3.5). Another application of the Lemma establishes the
required analogue of their inequality (3.5) used in (ii). The same inequality can also be
applied in (iii). The other arguments remain valid without changes. □

Lemma. Let $\tau(f) := \rho(f)^{2/(2+\gamma)}$ for some $\gamma > 0$ and suppose that Assumption A holds.
For all $n \in \mathbb{N}$, all $f, g \in \mathcal{F}$, and every even integer $Q \ge 2$ we have

\[ E|\nu_n(f - g)|^Q \le n^{-Q/2} C \bigl( (\tau(f - g)^2 n) + \cdots + (\tau(f - g)^2 n)^{Q/2} \bigr), \]

where $C$ depends only on $Q$, $\gamma$, and $\alpha$. The inequality remains valid when $f - g$ is replaced
by $b_k$ for any given $k \ge 1$.

Proof of the Lemma. Let $Z(i) := f(\xi_i) - E f(\xi_0) - (g(\xi_i) - E g(\xi_0))$. Assume without loss
of generality that $|Z(i)| \le 1$ for all $i \ge 1$; otherwise rescale and redefine $C$. Define
$Z'(i) := f(\xi_i') - E f(\xi_0) - (g(\xi_i') - E g(\xi_0))$ and note that $E Z(i) = E Z'(i) = 0$ for all $i \in \mathbb{Z}$
and all $f, g \in \mathcal{F}$ because $\xi_i$ and $\xi_i'$ are identically distributed. For fixed $k \ge 2$, $d \ge 1$, and
$1 \le m < k$, consider integers $i_1 \le \cdots \le i_m < i_{m+1} \le \cdots \le i_k$ so that $i_{m+1} - i_m = d$. Since
$Z(i)$ and $Z'(i)$ are stationary, repeatedly add and subtract to see that



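\[
\begin{aligned}
&\bigl| E Z(i_1) Z(i_2) \cdots Z(i_k) - E Z(i_1) Z(i_2) \cdots Z(i_m)\, E Z(i_{m+1}) \cdots Z(i_k) \bigr| \\
&\quad = \bigl| E Z(i_1 - i_m) Z(i_2 - i_m) \cdots Z(i_k - i_m) - E Z(i_1 - i_m) Z(i_2 - i_m) \cdots Z(0)\, E Z(d) \cdots Z(i_k - i_m) \bigr| \\
&\quad \le \bigl| E Z(i_1 - i_m) \cdots Z(0) \bigl( Z(d) - Z'(d) \bigr) Z(i_{m+2} - i_m) \cdots Z(i_k - i_m) \bigr| \\
&\qquad + \sum_{j=2}^{k-m} \bigl| E Z(i_1 - i_m) \cdots Z(0) Z'(d) \cdots \bigl( Z(i_{m+j} - i_m) - Z'(i_{m+j} - i_m) \bigr) \cdots Z(i_k - i_m) \bigr| \\
&\qquad + \bigl| E Z(i_1 - i_m) \cdots Z(0) Z'(d) \cdots Z'(i_k - i_m) - E Z(i_1 - i_m) \cdots Z(0)\, E Z(d) \cdots Z(i_k - i_m) \bigr|. \qquad (A.1)
\end{aligned}
\]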
In particular, the last summand on the right-hand side is zero because $Z(i_1 - i_m) \cdots Z(0)$
and $Z'(d) \cdots Z'(i_k - i_m)$ are independent and $Z(d) \cdots Z(i_k - i_m)$ and $Z'(d) \cdots Z'(i_k - i_m)$
are identically distributed. For a large enough $M > 0$ and some $s > 1$, Assumption A(i)
and distributional equivalence of $Z(d)$ and $Z'(d)$ imply
$\|Z(d) - Z'(d)\|_s \le \|f(\xi_d) - f(\xi_d')\|_s + \|g(\xi_d) - g(\xi_d')\|_s \le 2 \sup_{f \in \mathcal{F}} \|f(\xi_d) - f(\xi_d')\|_s \le M \alpha^d$.
Hölder's inequality then bounds the
first term on the right-hand side of the preceding display by

\[ \|Z(i_1) \cdots Z(i_m)\|_p \, \|Z(i_{m+2}) \cdots Z(i_k)\|_q \, M \alpha^d, \qquad (A.2) \]

where the reciprocals of $p$, $q$, and $s$ sum to 1. Proceeding similarly to Andrews and Pollard
(1994), another application of the Hölder inequality yields

\[ \|Z(i_1) \cdots Z(i_m)\|_p \le \Bigl( \prod_{j=1}^{m} E|Z(i_j)|^{mp} \Bigr)^{1/(mp)} \le \tau(f - g)^{(2+\gamma)/p} \]

whenever $mp \ge 2$ and similarly $\|Z(i_{m+2}) \cdots Z(i_k)\|_q \le \tau(f - g)^{(2+\gamma)/q}$ whenever
$(k - m - 1) q \ge 2$. Suppose for now that $k \ge 3$. If $k > m + 1$, take $s = (\gamma + Q)/\gamma$ and
$mp = (k - m - 1) q = (k - 1)/(1 - 1/s)$. Decrease the resulting exponent of $\tau(f - g)$ from
$Q(2 + \gamma)/(Q + \gamma)$ to 2 so (A.2) is bounded by $M \alpha^d \tau(f - g)^2$. If $k \ge 2$ and $k = m + 1$, the
factor $\|Z(i_{m+2}) \cdots Z(i_k)\|_q$ is not present in (A.2), but we can still choose $s = (\gamma + Q)/\gamma$
and $mp = (k - 1)/(1 - 1/s)$ to obtain the same bound. Identical arguments also apply to
each of the other summands in (A.1). Hence, we can find some $M' > 0$ so that

\[ |E Z(i_1) Z(i_2) \cdots Z(i_k)| \le |E Z(i_1) Z(i_2) \cdots Z(i_m)\, E Z(i_{m+1}) \cdots Z(i_k)| + M' \alpha^d \tau(f - g)^2. \]

Here M' in fact depends on k, but this does not disturb any of the subsequent steps. 
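The exponent bookkeeping behind the choice $s = (\gamma + Q)/\gamma$ can be spelled out in one line. This is only a restatement of the arithmetic in the proof, using that the reciprocals of $p$, $q$, and $s$ sum to 1 and that the two Hölder factors contribute the exponents $(2+\gamma)/p$ and $(2+\gamma)/q$.

```latex
\[
\frac{1}{p} + \frac{1}{q} = 1 - \frac{1}{s} = \frac{Q}{Q + \gamma},
\qquad
(2 + \gamma)\Bigl( \frac{1}{p} + \frac{1}{q} \Bigr) = \frac{Q(2 + \gamma)}{Q + \gamma} \ge 2
\iff Q\gamma \ge 2\gamma
\iff Q \ge 2.
\]
```

The combined exponent is therefore at least 2 for every even integer $Q \ge 2$, and lowering it to 2 weakens the bound whenever $\tau(f - g) \le 1$.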

Now replace (A.2) in Andrews and Pollard (1994) by the inequality in the preceding
display. In particular, replace their $8\alpha(d)^{1/s}$ with $M' \alpha^d$ and their $\tau^2$ with $\tau(f - g)^2$.
The rest of their arguments now go through without changes. The desired result for $b_k$
follows mutatis mutandis: Simply define $Z(i) = b_k(\xi_i)$, repeat the above steps, and invoke
Assumption A(ii) in place of Assumption A(i). □